FusedBatchNorm
=================

Applies fused batch normalization to the input tensor. The batch units are split across multiple cores, which carry out the normalization and the affine transform in parallel.

.. math::

    \hat{x}_{b,c} = \frac{x_{b,c} - \mathrm{mean}_c}{\sqrt{\mathrm{variance}_c + \epsilon}}, \quad
    y_{b,c} = \mathrm{scale}_c \cdot \hat{x}_{b,c} + \mathrm{offset}_c

Here :math:`b` indexes the unit and :math:`c` indexes the channel.

Inputs:
    - **input** - Address of the first element of the input tensor, of shape ``[unit, channel]``.
    - **scale** - Address of the first element of the scale array, of length ``channel``.
    - **offset** - Address of the first element of the offset array, of length ``channel``.
    - **mean** - Address of the first element of the normalization mean array, of length ``channel``.
    - **variance** - Address of the first element of the normalization variance array, of length ``channel``.
    - **epsilon** - Numerical stability term added to the variance.
    - **channel** - Number of channels.
    - **unit** - Number of normalization units (batch size × height × width).
    - **core_mask (int, optional)** - Core mask (applies only to the shared-memory version).

Outputs:
    - **output** - Address of the first element of the tensor that receives the fused batch normalization result.

Supported platforms:
    ``FT78NE`` ``MT7004``

.. note::
    - FT78NE supports the fp32 data type.
    - MT7004 supports the fp16 and fp32 data types.

**Shared-memory version:**

.. c:function:: void hp_fusedbatchnorm_s(const half *input, const half *scale, const half *offset, const half *mean, const half *variance, float epsilon, int channel, int unit, int core_mask, half *output)
.. c:function:: void fp_fusedbatchnorm_s(const float *input, const float *scale, const float *offset, const float *mean, const float *variance, float epsilon, int channel, int unit, int core_mask, float *output)

**C usage example:**

.. code-block:: c
    :linenos:
    :emphasize-lines: 15

    // FT78NE multi-core example
    #include

    int main(void) {
        const float *input = (const float *)0xA0000000; // DDR memory
        const float *scale = (const float *)0xB0000000;
        const float *offset = (const float *)0xB0001000;
        const float *mean = (const float *)0xB0002000;
        const float *variance = (const float *)0xB0003000;
        float *output = (float *)0xC0000000;
        int channel = 64;
        int unit = 1024;
        float epsilon = 1e-5f;
        int core_mask = 0xff;
        fp_fusedbatchnorm_s(input, scale, offset, mean, variance, epsilon, channel, unit, core_mask, output);
        return 0;
    }

**Private-memory version:**

.. c:function:: void hp_fusedbatchnorm_p(const half *input, const half *scale, const half *offset, const half *mean, const half *variance, float epsilon, int channel, int unit, half *output)
.. c:function:: void fp_fusedbatchnorm_p(const float *input, const float *scale, const float *offset, const float *mean, const float *variance, float epsilon, int channel, int unit, float *output)

**C usage example:**

.. code-block:: c
    :linenos:
    :emphasize-lines: 14

    // MT7004 single-core example
    #include

    int main(void) {
        const half *input = (const half *)0x10000000; // L2 memory, 512 * 32 * 2 bytes = 0x8000
        const half *scale = (const half *)0x10008000;
        const half *offset = (const half *)0x1000C000;
        const half *mean = (const half *)0x10010000;
        const half *variance = (const half *)0x10014000;
        half *output = (half *)0x10018000;
        int channel = 32;
        int unit = 512;
        float epsilon = 1e-4f;
        hp_fusedbatchnorm_p(input, scale, offset, mean, variance, epsilon, channel, unit, output);
        return 0;
    }